Return-Path: <icon-group-sender>
Received: (from root@localhost)
by baskerville.CS.Arizona.EDU (8.11.1/8.11.1) id g98EdAA28643
for icon-group-addresses; Tue, 8 Oct 2002 07:39:10 -0700 (MST)
Message-Id: <200210081439.g98EdAA28643@baskerville.CS.Arizona.EDU>
Date: Mon, 7 Oct 2002 16:41:19 -0700 (PDT)
From: Mark Kot <kot@amath.washington.edu>
Subject: Re: icon
To: trutkin@physics.clarku.edu
Cc: icon-group@cs.arizona.edu
Errors-To: icon-group-errors@cs.arizona.edu
Status: RO
>Date: Mon, 7 Oct 2002 18:10:14 -0400 (EDT)
>From: Taybin Rutkin <trutkin@physics.clarku.edu>
>X-Sender: trutkin@planck.clarku.edu
>To: Mark Kot <kot@amath.washington.edu>
>cc: icon-group@CS.Arizona.EDU
>Subject: Re: icon
>MIME-Version: 1.0
>
>On Mon, 7 Oct 2002, Mark Kot wrote:
>
>> iconcise is actually based on a handbook of common
>> verbose phrases that also recommends substitutions.
>> I haven't posted the program for the simple reason
>> that I don't want to cross over from ``fair use''
>> to copyright infringement (since the program, in
>> effect, implements the book).
>
>And what about izipf? Is that this program but with USENET connectivity
>and a history mechanism of some sort?
>
>Taybin
>--
>http://www.piratesvsninjas.com
>
No, izipf is rather different. It's actually a whole suite
of programs. (My colleagues and I were trying to work out
the community ecology of newsgroups.)
The original program was irank. The program takes archived
newsgroups (it also works on email spools), looks for the
``From:'' line and finds the ID of the sender. It then
tallies the number of posts from each poster and ranks the
posters. We use this program to generate frequency-rank
or rank-abundance distributions.
Here's the code (are we allowed to post code?):
#########################################################
##### PROGRAM: irank.icn
##### CREATED: 01/05/2001
##### REVISED: 01/05/2001
##### PURPOSE: TALLY AND RANK SOURCES
##### TRY TAG FIRST
##### IF TAG FAILS USE FULL FROM
#########################################################
link strings

procedure main(clargs)
   infile := open(clargs[1]) | &input
   sources := table(0)
   while line := read(infile) do line ?
   {
      if ="From: " then
      {
         address := map(tab(0))
         tags := 0
         every word := words(address, ' <>\t\r\n\v\f') do word ?
         {
            if name := tab(find("@")) then
            {
               tags +:= 1
               sources[name] +:= 1
               break
            }
         }
         if tags = 0 then sources[address] +:= 1
      }
   }
   ranked := sort(sources, 2)
   every i := 1 to *ranked do write(i, "\t", ranked[*ranked-i+1][2])
end
Amazingly, the rank-abundance distribution for most newsgroups follows
Zipf's law.
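One rough way to check this is to fit a straight line to log(count)
against log(rank); in its classic form, Zipf's law predicts a slope
near -1. The sketch below is only an illustration and is not one of
the izipf programs. It assumes its input is irank output (one
"rank<TAB>count" line per poster) on standard input, and all of the
variable names are made up for the example.

```icon
# Hypothetical helper, not part of the izipf suite.
# Reads irank output ("rank<TAB>count" lines) from standard input and
# prints the least-squares slope of log(count) against log(rank).
procedure main()
   n := 0
   sx := sy := sxx := sxy := 0.0
   while line := read() do line ? {
      # Skip any line that is not "number<TAB>number" with positive values.
      if (rank := real(tab(upto('\t')))) & move(1) &
         (count := real(tab(0))) & rank > 0 & count > 0 then {
         x := log(rank)
         y := log(count)
         n +:= 1
         sx +:= x
         sy +:= y
         sxx +:= x * x
         sxy +:= x * y
      }
   }
   if n < 2 then stop("need at least two ranked posters")
   write("slope = ", (n * sxy - sx * sy) / (n * sxx - sx * sx))
end
```

Assuming the sketch is saved as slope.icn (a made-up name), something
like "irank archive | slope" would print the fitted slope. An ordinary
least-squares fit on log-log data is a crude test, which is presumably
why the suite also includes KS and chi-square programs.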
As I said, there is a whole suite of programs (most of which may
not mean much):
aclass aclass (1) - izipf program to generate an abundance class distribution
bs_a bs_a (1) - izipf program to fit a broken-stick series
bs_ra bs_ra (1) - izipf program to generate broken-stick rank-abundance distribution
bs_s bs_s (1) - izipf program to generate broken-stick spectrum
cofdet cofdet (1) - izipf program to run regressions on many newsgroups
correlate correlate (1) - izipf program for predicted and actual slopes for many newsgroups
counts counts (1) - izipf program to determine newsgroup parameters from irank output
eslope eslope (1) - izipf program for following slopes in a simulation
getdata getdata (1) - izipf program to fetch newsgroup headers from archive
ichi ichi (1) - izipf program for goodness of fit to Simon's Yule distribution
iksimon iksimon (1) - izipf KS test for fit to Simon model
iksimons iksimons (1) - izipf KS test for fit to Simon model
iksmspectra iksmspectra (1) - izipf KS test of many real and simulated spectra for many newsgroups
ikspectra ikspectra (1) - izipf KS test of one real and simulated spectrum for many newsgroups
ikspectrum ikspectrum (1) - izipf KS test of real and simulated spectra
inamerank inamerank (1) - izipf program to rank, tally, and name posters
inew inew (1) - izipf program to determine posters as a function of posts
inumber inumber (1) - izipf program to assign a poster number to each post
irank irank (1) - izipf program to produce rank-frequency distribution
itagless itagless (1) - izipf program to find tagless headers
logs_a logs_a (1) - izipf program to fit a logarithmic series
logs_ra logs_ra (1) - izipf program to generate a logarithmic series rank-abundance distribution
logs_s logs_s (1) - izipf program to generate logarithmic spectrum
param param (1) - izipf program to determine newsgroup parameters
params params (1) - izipf program to determine parameters for many newsgroups
simon_says simon_says (1) - izipf program to generate Simon's Yule distribution
simulate simulate (1) - izipf program to plot real and simulated rank-abundance distributions for all newsgroups
spectrum spectrum (1) - izipf program to generate frequency spectrum
sw sw (1) - izipf program to compute Shannon-Weaver diversity index from irank input
tagnumbers tagnumbers (1) - izipf program to count and classify tags for many newsgroups
theory theory (1) - izipf program to predict slopes of Gunther process
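
To give a flavor of what the smaller programs in a list like this do,
here is a minimal sketch of a Shannon-Weaver index computed from irank
output, H = -sum(p * log(p)), where p is each poster's share of all
posts. It is not the actual sw program; it assumes "rank<TAB>count"
lines on standard input and uses natural logarithms, and the names are
invented for the example.

```icon
# Hypothetical sketch, not the sw program from the izipf suite.
# Reads irank output ("rank<TAB>count" lines) from standard input and
# prints the Shannon-Weaver diversity index H = -sum(p * log(p)).
procedure main()
   counts := []
   total := 0
   while line := read() do line ? {
      # Keep only well-formed lines with a positive integer count.
      if tab(upto('\t')) & move(1) & (c := integer(tab(0))) & c > 0 then {
         put(counts, c)
         total +:= c
      }
   }
   if total = 0 then stop("no counts read")
   h := 0.0
   every p := real(!counts) / total do h -:= p * log(p)
   write("H = ", h)
end
```

A newsgroup where one poster writes everything gives H = 0; the index
grows as posts are spread more evenly over more posters.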
That's probably more than you wanted to know.
Mark